Return-Path: <connolly@pixel.convex.com>
Received: from dxmint.cern.ch by nxoc01.cern.ch (NeXT-1.0 (From Sendmail 5.52)/NeXT-2.0)
    id AA26846; Tue, 12 Jan 93 23:10:06 MET
Received: by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
    id AA09740; Tue, 12 Jan 1993 23:25:13 +0100
Received: from pixel.convex.com by convex.convex.com (5.64/1.35)
    id AA13978; Tue, 12 Jan 93 16:25:09 -0600
Received: from localhost by pixel.convex.com (5.64/1.28)
    id AA02411; Tue, 12 Jan 93 16:25:08 -0600
Message-Id: <9301122225.AA02411@pixel.convex.com>
To: "Thomas A. Fine" <fine@cis.ohio-state.edu>
Cc: timbl@nxoc01.cern.ch, www-talk@nxoc01.cern.ch
Subject: Re: HTML todo list
In-Reply-To: Your message of "Tue, 12 Jan 93 16:42:55 EST."
    <9301122142.AA13427@soccer.cis.ohio-state.edu>
Date: Tue, 12 Jan 93 16:25:07 CST
From: Dan Connolly <connolly@pixel.convex.com>

>I'd also like to express the opinion that we shouldn't make the HTML so
>terribly specific. Every new situation shouldn't require another addition
>to HTML. If that were the case, we'd never be finished.

Well, suffice it to say that HTML doesn't give much semantic
information. It would be nice to express relationships between
pieces of information through the document structure, but in
HTML we mostly use links.

>>[...] The same is true for newlines: it's
>>illegal to treat
>><foo>
>>content
>></foo>
>>different from <foo>content</foo> because the difference is
>>not reported by the parser (unless we do some shortref magic
>>to force the parser to report the difference.)
>
>I don't think we should do any shortref magic. The simplest thing
>(the way it works now) is that the two examples above are identical.
>I say this is fine.

But it's a royal pain to implement! Doing full SGML newline processing
by the standard is quite involved (see the article by Erik Naggum
in comp.text.sgml about SGML and Records that I referenced in
an earlier message). For example, you can't just get rid of all
newlines immediately before or after tags, like it says in the
web documentation. The standard ignores only those right after a
start tag (of a non-empty element), those right before an end tag,
and those on a line containing only comments and processing instructions.
Newlines around <P> tags, for example, _are_ reported.
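
To make the rule concrete, here is a rough Python sketch of it. It is
not a conforming SGML parser: the tokenizer is a simple regular
expression, the names EMPTY_ELEMENTS and drop_ignored_newlines are
made up for illustration, and treating P as an empty element is an
assumption that a real parser would read from the DTD.

```python
import re

# Assumption for illustration: elements declared EMPTY in the DTD.
EMPTY_ELEMENTS = {"p"}

# Comments, processing instructions, tags, newlines, and runs of other text.
TOKEN = re.compile(r"<!--.*?-->|<\?.*?>|</?[A-Za-z][^>]*>|\n|[^<\n]+|<", re.S)


def kind(token):
    """Classify one token produced by TOKEN."""
    if token.startswith("<!--"):
        return "comment"
    if token.startswith("<?"):
        return "pi"
    if token.startswith("</"):
        return "end"
    if token.startswith("<") and len(token) > 1:
        name = re.match(r"<([A-Za-z][^\s>]*)", token).group(1).lower()
        return "empty" if name in EMPTY_ELEMENTS else "start"
    if token == "\n":
        return "newline"
    return "text"


def drop_ignored_newlines(source):
    """Return source without the newlines the rule above says are ignored."""
    tokens = TOKEN.findall(source)
    kinds = [kind(t) for t in tokens]
    kept = []
    line = []  # kinds of the non-blank tokens seen since the previous newline
    for i, (token, k) in enumerate(zip(tokens, kinds)):
        if k != "newline":
            kept.append(token)
            if not (k == "text" and token.strip() == ""):
                line.append(k)
            continue
        # Right after a start tag of a non-empty element.
        after_start = i > 0 and kinds[i - 1] == "start"
        # Right before an end tag.
        before_end = i + 1 < len(tokens) and kinds[i + 1] == "end"
        # Ending a line that holds only comments and processing instructions.
        markup_only = bool(line) and all(x in ("comment", "pi") for x in line)
        if not (after_start or before_end or markup_only):
            kept.append(token)
        line = []
    return "".join(kept)


print(repr(drop_ignored_newlines("<foo>\ncontent\n</foo>")))
print(repr(drop_ignored_newlines("one\n<p>\ntwo")))
```

Under these assumptions the first call collapses to the same text as
<foo>content</foo>, while the newlines around <p> survive, which is
exactly the asymmetry that a "strip every newline next to a tag" hack
gets wrong.
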
If we don't stick the SHORTREF magic in the DTD to force the
parser to report all newlines, we'll end up with countless hacks
at newline processing, none of which matches the standard, and
it'll be luck if any two of them match each other.

Dan